
Add rust runtime #1597

Merged
merged 10 commits into from
Oct 6, 2018

Conversation

nhynes
Member

@nhynes nhynes commented Aug 14, 2018

This PR adds a Rust runtime which is useful when creating statically linked TVM modules--either native or Wasm. Indeed, Rust has the best Wasm support. It's also good for SGX.

The current code supports both TVM and NNVM modules. The API around Tensor conversions to DLTensor is a bit awkward since DLTensor isn't owned, but it shouldn't be too hard to iron out.

There's one nit, though: the dshape field in the Tensor struct exists solely because TVM uses u64 as the shape_t even when on 32-bit platforms. I'll make a PR for this eventually...

@nhynes nhynes mentioned this pull request Aug 14, 2018
if let Some(q) = SGX_QUEUES.lock().unwrap().pop_front() {
ThreadPool::run_worker(q);
}
}
Member Author

this function interacts with the C++ runtime dylib

pub(super) strides: Option<Vec<usize>>,
pub(super) byte_offset: isize,
pub(super) numel: usize,
pub(super) dshape: Vec<i64>,
Member Author

this is the dshape mentioned in the PR description. When the target arch is 32-bit, usize is 32 bits. Since the TVM module wants a 64-bit shape array, there needs to be an owned copy stored somewhere (i.e. in dshape).
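To illustrate the point, here is a minimal sketch (not the PR's code; `shape_as_i64` is a hypothetical helper) of why an owned `Vec<i64>` copy is needed when handing a shape to a `DLTensor` on a 32-bit target:

```rust
// Sketch: DLTensor expects a pointer to a 64-bit shape array, but this
// crate stores shapes as Vec<usize>. On a 32-bit target usize is only
// 32 bits wide, so the shape must be copied into an owned Vec<i64>
// (the `dshape` field) whose pointer can then back the DLTensor.
fn shape_as_i64(shape: &[usize]) -> Vec<i64> {
    shape.iter().map(|&d| d as i64).collect()
}

fn main() {
    let shape: Vec<usize> = vec![2, 3, 4];
    let mut dshape = shape_as_i64(&shape);
    assert_eq!(dshape, vec![2i64, 3, 4]);
    // dshape.as_mut_ptr() could now be stored in a DLTensor's shape field.
    let _ptr: *mut i64 = dshape.as_mut_ptr();
}
```

On 64-bit targets the copy is redundant in principle, but keeping one code path is simpler than transmuting between `Vec<usize>` and `Vec<i64>`.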

report_todo = "Never"
report_fixme = "Never"
ignore = []
verbose_diff = false
Member Author

this is the default rustfmt.toml produced by rustfmt --dump-default-config

@tqchen
Member

tqchen commented Aug 14, 2018

@jroesch @ehsanmok can you help review this?

@tqchen
Member

tqchen commented Aug 14, 2018

Let us directly put it under tvm/rust

@ehsanmok
Contributor

ehsanmok commented Aug 14, 2018

@tqchen absolutely!

First of all, I'd like to thank @nhynes for his initiation. Since three months ago, I've followed his code and learned a lot from his contribution. Well-done Nick!

Here're some questions that I need clarifications:

  1. What's the clear and correct way of compiling tvm-rs? I'm on Rust nightly 1.30 and cannot compile it. Do I need to add an Xargo.toml and compile with that?

  2. What about a non-CPU Rust runtime context? Don't we want GPU support in Rust, for example? I was under the impression that we want the API to be as close as possible to the other runtime APIs (Java, JS, Python), but this work doesn't do that. I know that Rust's non-CPU support is not yet as mature as the other languages mentioned, but that doesn't justify giving up on what's possible now. What's wrong with also having the exact runtime API support that Java (dylib) has, for example? (This is exactly what I'm trying to complete in tvm-rust.)
    @tqchen Could you enlighten me on the direction?

  3. There's no documentation or example repo for newcomers. In my opinion, this is separate from a tutorial, considering a Rust audience with no ML-related experience.

@tqchen
Member

tqchen commented Aug 14, 2018

+1 for docs and comments. I would recommend having one example for users to get started, and let us make sure all the public APIs are documented and commented; that is what we need reviews for :)

I also created an issue to follow the discussion #1601

@tqchen tqchen mentioned this pull request Aug 14, 2018
@ehsanmok
Contributor

ehsanmok commented Aug 14, 2018

@tqchen yes, it was confusing to me that this PR adds more than, and differs from, the "runtime frontend" support I had in mind, which I thought was the intention of the v0.5 roadmap.

@nhynes
Member Author

nhynes commented Aug 14, 2018

@ehsanmok

What's the clear and correct way of compiling tvm-rs?

cargo build. The error you were getting before was due to the build.rs being ignored by git.

What's wrong with having the exact runtime API support as Java

Nothing at all. That's what we're working on: a unified Rust crate-of-crates which supports both frontend and "backend" Rust. Currently, having a Rust backend is useful for the emerging technologies of Wasm and SGX. These are effectively CPU targets, so that's where the effort has gone.

documentation and example repo

The tests could easily be made into docs. I'll wait for @jroesch's reviews before diving into docs.

}
}

fn tensor_from_array_storage<'a, 's, T, D: ndarray::Dimension>(
Member

is it possible to make these static methods?

rust/src/runtime/array.rs (resolved)
DLDataTypeCode_kDLFloat, DLDataTypeCode_kDLInt, DLDataTypeCode_kDLUInt, DLTensor,
};

const NDARRAY_MAGIC: u64 = 0xDD5E40F096B4A13F; // Magic number for NDArray file
Member

These need a deeper explanation.
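As an example of the kind of explanation the reviewer is asking for, here is an illustrative (not the PR's) sketch of how such a magic number is typically used: it is the first 8 bytes of a serialized-NDArray file and lets the loader reject files of the wrong format early. The constant mirrors `NDARRAY_MAGIC` from the quoted code; the reader logic and endianness are assumptions.

```rust
use std::io::{Cursor, Read};

// Magic number identifying a serialized NDArray file (from the quoted code).
const NDARRAY_MAGIC: u64 = 0xDD5E40F096B4A13F;

// Read the 8-byte header and verify it matches the expected magic number.
fn check_magic<R: Read>(reader: &mut R) -> Result<(), &'static str> {
    let mut buf = [0u8; 8];
    reader.read_exact(&mut buf).map_err(|_| "truncated header")?;
    if u64::from_le_bytes(buf) == NDARRAY_MAGIC {
        Ok(())
    } else {
        Err("not an NDArray file")
    }
}

fn main() {
    let mut good = Cursor::new(NDARRAY_MAGIC.to_le_bytes().to_vec());
    assert!(check_magic(&mut good).is_ok());
    let mut bad = Cursor::new(vec![0u8; 8]);
    assert!(check_magic(&mut bad).is_err());
}
```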

@jroesch
Member

jroesch commented Aug 15, 2018

Just did a quick pass before bed; the code looks good, but it could use lots of docs explaining it. I think it would be very hard for someone who didn't know the existing runtime to understand. I'll take another stab tomorrow, or after you do docs? Just let me know what would be more useful.

@nhynes
Member Author

nhynes commented Aug 15, 2018

thanks for your review @jroesch! In the interest of making efficient use of time, I'll make a docs pass before you do further review.

authors = ["Nick Hynes <nhynes@berkeley.edu>"]

[features]
par-launch-alloc = []
Contributor

Has #1226 been resolved already? If not, then I don't think it's a good idea to include it as a feature; maybe keep it in a separate branch until it's resolved!

const DEFAULT_ALIGN_BYTES: usize = 4;

#[derive(PartialEq, Eq)]
pub struct Allocation {
Contributor

Would it be better to impl Alloc for it? Or at least add the stabilized #[global_allocator] attribute?

Member Author

This isn't actually an allocator--it's just a way to get bytes out of the global allocator in a less-unstable way.
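The distinction can be made concrete with a minimal sketch (illustrative, not the PR's actual `Allocation` type): the struct does not implement `Alloc`; it simply requests bytes from the global allocator via the stable `std::alloc` functions and remembers the `Layout` so it can free them on `Drop`.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Not an allocator: just an RAII handle over bytes obtained from the
// global allocator, keeping the Layout needed to deallocate correctly.
struct Allocation {
    layout: Layout,
    ptr: *mut u8,
}

impl Allocation {
    fn new(size: usize, align: usize) -> Self {
        let layout = Layout::from_size_align(size, align).expect("bad layout");
        let ptr = unsafe { alloc(layout) };
        assert!(!ptr.is_null(), "allocation failed");
        Allocation { layout, ptr }
    }
}

impl Drop for Allocation {
    fn drop(&mut self) {
        // The same Layout must be passed back to dealloc.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

fn main() {
    let a = Allocation::new(64, 8);
    // The returned pointer respects the requested alignment.
    assert_eq!(a.ptr as usize % 8, 0);
}
```

Since this avoids the (then-unstable) `Alloc` trait, it stays closer to stable Rust, which matters for a crate targeting unusual platforms like Wasm and SGX.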

pub(super) shape: Vec<usize>,
pub(super) strides: Option<Vec<usize>>,
pub(super) byte_offset: isize,
pub(super) numel: usize,
Contributor

It'd be better to rename numel to size, which is now the standard term in both NumPy and bluss's Rust ndarray.

}
}

impl<'a, 't> TryFrom<&'a Tensor<'t>> for ndarray::ArrayD<f32> {
Contributor

Just wondering about other numeric types, like f64 or ints?

fn get_function<S: AsRef<str>>(&self, name: S) -> Option<PackedFunc>;
}

pub struct SystemLibModule {}
Contributor

Maybe make this an empty struct with no {} (i.e., pub struct SystemLibModule;)?

pub type PackedFunc = Box<Fn(&[TVMArgValue]) -> TVMRetValue>;

#[macro_export]
macro_rules! call_packed {
Contributor

Nice and clever :)
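For readers unfamiliar with the pattern, here is a simplified sketch of how a call_packed!-style macro works: each argument is converted into the runtime's uniform argument type before the boxed function is invoked. The `ArgValue` enum here is a stand-in assumption, not the PR's actual `TVMArgValue`/`TVMRetValue` types.

```rust
// Simplified stand-in for the runtime's tagged argument/return type.
#[derive(Debug, Clone, PartialEq)]
enum ArgValue {
    Int(i64),
    Float(f64),
}

impl From<i64> for ArgValue {
    fn from(v: i64) -> Self { ArgValue::Int(v) }
}
impl From<f64> for ArgValue {
    fn from(v: f64) -> Self { ArgValue::Float(v) }
}

// A type-erased function taking a slice of tagged args.
type PackedFunc = Box<dyn Fn(&[ArgValue]) -> ArgValue>;

// Convert each argument via From and call the packed function.
macro_rules! call_packed {
    ($fn:expr $(, $arg:expr)*) => {
        ($fn)(&[$( ArgValue::from($arg) ),*])
    };
}

fn main() {
    let add: PackedFunc = Box::new(|args| {
        if let (ArgValue::Int(a), ArgValue::Int(b)) = (&args[0], &args[1]) {
            ArgValue::Int(a + b)
        } else {
            panic!("expected two ints")
        }
    });
    assert_eq!(call_packed!(add, 1i64, 2i64), ArgValue::Int(3));
}
```

The macro's value is ergonomic: callers write ordinary Rust values and the conversion boilerplate disappears at the call site.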

impl_prim_ret_value!(u32, 1);
impl_prim_ret_value!(f32, 2);
impl_boxed_ret_value!(String, 11);

Contributor

Other primitives, perhaps?
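The remaining primitives could be covered by invoking the same macro for each type, along the lines of this sketch (the `RetValue` struct and type codes here are simplified assumptions, not the PR's actual `TVMRetValue`):

```rust
// Simplified stand-in for a TVMRetValue-like type: a raw value plus a
// type code (e.g. 0 = int, 1 = uint).
#[derive(Debug, PartialEq)]
struct RetValue {
    value: u64,
    type_code: i64,
}

// Generate a From<$ty> impl for each primitive.
macro_rules! impl_prim_ret_value {
    ($ty:ty, $code:expr) => {
        impl From<$ty> for RetValue {
            fn from(v: $ty) -> Self {
                RetValue { value: v as u64, type_code: $code }
            }
        }
    };
}

// Covering more integer primitives, as the review comment suggests:
impl_prim_ret_value!(i8, 0);
impl_prim_ret_value!(i16, 0);
impl_prim_ret_value!(i32, 0);
impl_prim_ret_value!(i64, 0);
impl_prim_ret_value!(u8, 1);
impl_prim_ret_value!(u16, 1);
impl_prim_ret_value!(u64, 1);

fn main() {
    assert_eq!(RetValue::from(7u8), RetValue { value: 7, type_code: 1 });
    assert_eq!(RetValue::from(3i32), RetValue { value: 3, type_code: 0 });
}
```

Floats would need a bit-preserving conversion rather than `as u64`, which is presumably why they get their own impls in the quoted code.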

}

impl<'a> Tensor<'a> {
pub fn shape(&self) -> Vec<usize> {
Contributor

These and similar methods could be marked #[inline].

Member Author

I generally don't use #[inline] because 1) I'm not smarter than the compiler and 2) the end user can always use link-time optimization (ref).

Also keep in mind that a fast binary is not always as important as a small one: in Wasm, for instance, small binaries download faster. Similarly, in SGX, a smaller binary affords more enclave memory for application use. In any case, LTO is unconditionally more useful than inlining.
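As a concrete illustration of the LTO route (standard Cargo configuration, not something introduced by this PR), an end user can opt in from their own binary crate's manifest:

```toml
# Cargo.toml of the consuming binary crate
[profile.release]
lto = true        # enable whole-program link-time optimization
codegen-units = 1 # give LLVM maximum cross-function visibility
```

This pushes the inlining decision to the final link, where the optimizer sees the whole program, rather than baking #[inline] hints into the library.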

Contributor

Partially agree! While it's debatable, my reference is bluss's ndarray and similar libs he's written with careful inlining. I'm not familiar with Wasm, though.

Member Author (@nhynes, Aug 15, 2018)

Contributor

Thanks for the links! I'll look into them. The major source of inlining (and of most things, in fact) is std, for example. I think if in doubt in general, then skip it for now.

}

impl DataType {
fn itemsize(&self) -> usize {
Contributor (@ehsanmok, Aug 15, 2018)

Same here: #[inline(always)]. (Not repeating this for the similar methods.)
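For context, itemsize-style methods are typically a one-line computation from the DLPack-style bit width and lane count, which is why they are natural inlining candidates. A hedged sketch (field names are assumptions, not necessarily the PR's `DataType` layout):

```rust
// Simplified stand-in for a DLDataType-like descriptor.
struct DataType {
    bits: usize,  // bit width of one lane (e.g. 32 for f32)
    lanes: usize, // vector lanes (1 for scalar types)
}

impl DataType {
    // Size in bytes of one element: bits * lanes, converted to bytes.
    fn itemsize(&self) -> usize {
        (self.bits * self.lanes) >> 3
    }
}

fn main() {
    let f32x1 = DataType { bits: 32, lanes: 1 };
    assert_eq!(f32x1.itemsize(), 4);
    let f16x4 = DataType { bits: 16, lanes: 4 };
    assert_eq!(f16x4.itemsize(), 8);
}
```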

@nhynes
Member Author

nhynes commented Aug 20, 2018

@jroesch I added some docs. Hopefully they're sufficient for your critical review.

@ehsanmok Thanks for the feedback. I addressed most of it. I'll attend to the other comments when we're closer to merging.

@nhynes
Member Author

nhynes commented Aug 22, 2018

Rendered docs

@ehsanmok
Contributor

ehsanmok commented Aug 22, 2018

@nhynes thanks! a couple of points/questions:

  1. Besides adding a separate example repo with a couple of end-to-end use cases (a cleaner version of the integration tests, maybe?), I suggest adding more docs, especially in src/lib.rs, describing what TVM is, links to tvm.ai, more tutorials, etc., and what this runtime lib provides.

  2. For cleaner rendering of the docs, I suggest not including the dependencies and/or using #[doc(hidden)] to control the visibility of some parts of this lib. For example, consider hiding the constants in tvm::ffi::runtime.

  3. Question (to @tqchen as well): are we ever considering supporting Rust as a backend replacement for C++? If so, I hardly think it'd be a plausible idea; if not, then shall we get rid of the separate tvm::runtime module namespace in this PR and simply put everything in the root lib?

@tqchen
Member

tqchen commented Aug 22, 2018

I think the Rust version of the runtime has its own merit, mainly for Wasm support (it seems Rust's support is better than C++'s), so it is good to enable this option for users who need it.

@ehsanmok
Contributor

@tqchen let me rephrase my question. Wouldn't it be clear from the context of the library that Rust will provide runtime support? In relation to #1601, I'm thinking of three types of API support: 1) compiler backend: C++ (and Python for prototyping); 2) runtime frontend: Python, Java, JS, etc.; 3) runtime backend (this PR) and frontend (my work): Rust. So Rust is not going to be in 1), right?
Then by default, tvm/rust is about the runtime, whether backend (Wasm etc.) or frontend. That's why I'm asking to simplify the module namespace and place everything in tvm::runtime simply in the root tvm.

@nhynes
Member Author

nhynes commented Aug 22, 2018

place everything in tvm::runtime as simply in root tvm

What you're working on isn't a runtime, though. It'd still go in the tvm/rust directory but have a different crate. We could have rust/tvm/runtime be tvm-runtime and your frontend, simply tvm.

rust compiler backend

Not until Rust has good Python FFI support. That's the main benefit of C[++].

@nhynes nhynes mentioned this pull request Sep 24, 2018
@tqchen
Member

tqchen commented Sep 25, 2018

@ehsanmok @jroesch @nhynes it is a good time to revive this thread; let us add follow-up comments and work to merge this in.

@ehsanmok
Contributor

@tqchen the last batch of changes addressed some of my concerns and comments above. Besides improving the docs and some other smaller things (like #[inline] attributes) that can be done later, my main concern is #1226, which the build seems to allow. If @nhynes thinks it was already resolved, then I think there shouldn't be a major blocking issue to merging. If not, then at least we can remove the feature and fix it later!

@nhynes
Member Author

nhynes commented Sep 25, 2018

my main concern is related to #1226 and seems allowing it in the build

I haven't looped around to fixing that, but since the rust runtime is mostly useful for Wasm (single threaded) and SGX (which doesn't manage its own threads), it's not a blocker.

@ehsanmok
Contributor

@nhynes Ok! So we shouldn't allow the related par-launch-alloc feature in master until it is resolved.

@nhynes
Member Author

nhynes commented Sep 26, 2018

Actually, I just fixed the bug. Turns out the issue was that the temporal workspaces only work correctly if they have the same alignment as the dtype they're supposed to hold. Go figure.
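The alignment point in that fix can be sketched as follows (illustrative; `workspace_for` is a hypothetical helper, not the PR's code): a workspace meant to hold values of type T must be allocated with at least T's alignment, not a fixed default such as 4 bytes.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::mem;

// Allocate a scratch workspace sized and aligned for `count` values of T.
fn workspace_for<T>(count: usize) -> (*mut u8, Layout) {
    let layout =
        Layout::from_size_align(count * mem::size_of::<T>(), mem::align_of::<T>())
            .expect("bad layout");
    let ptr = unsafe { alloc(layout) };
    assert!(!ptr.is_null(), "allocation failed");
    (ptr, layout)
}

fn main() {
    let (ptr, layout) = workspace_for::<f64>(16);
    // Correctly aligned for f64 (8-byte alignment on typical targets);
    // a fixed 4-byte default alignment could violate this.
    assert_eq!(ptr as usize % mem::align_of::<f64>(), 0);
    unsafe { dealloc(ptr, layout) };
}
```

With a too-small default alignment, reads and writes of wider types through the workspace are undefined behavior, which matches the kind of subtle breakage described above.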

@tqchen
Member

tqchen commented Oct 2, 2018

Let us push to get this in, @nhynes please send a separate PR to upstream to update the Dockerfile.demo_cpu with necessary rust env.

@nhynes I took a quick look; although I am not a Rust expert, it seems to me that the documentation is still sparse. Please at least document:

  • The TVM runtime module (e.g. "This is a Rust implementation of the TVM runtime, usable for SGX, Wasm, etc.")
  • The user-facing API functions (borrow docs from TVM API)
  • Some of the subtle implementation points that @jroesch mentioned.

After this is done and the tests pass, we can merge it in and follow up with @ehsanmok on the Rust frontend.

@tqchen tqchen added status: need update need update based on feedbacks and removed status: need review labels Oct 2, 2018
@nhynes
Member Author

nhynes commented Oct 2, 2018

documentation are still sparse

The convention is to use the auto-generated docs with manual docs only when necessary. Here's a link to the current docs. It's also populated with examples.

I'll add a top-level module doc, add a dockerfile config, and address those code review comments.

@nhynes
Member Author

nhynes commented Oct 2, 2018

Dockerfile.demo_cpu

Should I also update CI? Rust builds might be a bit flaky since this crate needs to use nightly, though.

@tqchen
Member

tqchen commented Oct 4, 2018

Oh, yes, I meant to say ci_cpu. Using nightly is fine. However, note that we rebuild the docker image infrequently, which means we will be stuck on a certain nightly version. Using a stable version or a hashtag is preferred.

@tqchen
Member

tqchen commented Oct 5, 2018

Please add rust testcases to Jenkinsfile to here https://github.com/dmlc/tvm/blob/master/Jenkinsfile#L141

@nhynes
Member Author

nhynes commented Oct 6, 2018

The testcases are added in nhynes#1.
I'll make a PR for that here when this branch and #1825 are merged.

@tqchen tqchen merged commit 5563b72 into apache:master Oct 6, 2018
@tqchen tqchen added status: accepted and removed status: need update need update based on feedbacks labels Oct 6, 2018
@tqchen
Member

tqchen commented Oct 6, 2018

Thanks @jroesch @nhynes @ehsanmok , this is now merged

@nhynes nhynes deleted the rust-runtime branch October 12, 2018 05:33
FrozenGene pushed a commit to FrozenGene/tvm that referenced this pull request Dec 27, 2018
@ZihengJiang ZihengJiang mentioned this pull request Feb 2, 2019